simon-5502-12-slides

Topics to be covered

  • What you will learn
    • Mathematical formulation of random intercepts model
    • Description of HIV-intervention data
    • Random intercepts model using hiv-intervention data
    • Mathematical formulation of random slopes model
    • Assumptions and complications
    • Sample size justification

Longitudinal data

  • Measurements taken at different times
    • Emphasis on changes over time

Random intercepts model, 1

  • Simplest pattern for longitudinal data
  • \(Y_{ij},\ i=1,...,n;\ j=1,...,k\)
    • n subjects, k time points
  • \(t_j\), time of jth measurement
    • First time is often zero

Random intercepts model, 2

  • \(Y_{ij}=\beta_0+u_{0i}+\beta_1 t_j + \epsilon_{ij}\)
    • \(\beta_0\) and \(\beta_1\) are unknown constants
    • \(u_{0i}\) and \(\epsilon_{ij}\) are normally distributed
      • \(SD(u_{0i})=\sigma_{intercept}\)
      • \(SD(\epsilon_{ij})=\sigma_{error}\)
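A small simulation can make the model concrete. The sketch below generates data from the random intercepts model using base R only. All parameter values (`beta_0`, `beta_1`, `sigma_intercept`, `sigma_error`, the number of subjects) are invented for illustration and are not taken from any data set in these slides. It also assumes that the random intercepts and the errors are independent of each other.

```r
# Simulate Y_ij = beta_0 + u_0i + beta_1 * t_j + epsilon_ij
# All numeric values below are hypothetical, chosen only for illustration.
set.seed(5502)

n   <- 5000            # subjects (large, so empirical summaries are stable)
t_j <- c(0, 6, 12)     # measurement times
k   <- length(t_j)     # time points per subject

beta_0 <- 20           # fixed intercept
beta_1 <- -0.2         # fixed slope
sigma_intercept <- 8   # SD of the random intercepts
sigma_error     <- 7   # SD of the residual errors

# One random intercept per subject, one error per observation
u_0     <- rnorm(n, mean = 0, sd = sigma_intercept)
epsilon <- matrix(rnorm(n * k, mean = 0, sd = sigma_error), nrow = n, ncol = k)

# Row i is subject i, column j is time t_j; u_0 recycles down each column
trend <- matrix(beta_1 * t_j, nrow = n, ncol = k, byrow = TRUE)
y <- beta_0 + u_0 + trend + epsilon

# Compare the empirical correlation between two time points
# with the value implied by the two standard deviations
theoretical_corr <- sigma_intercept^2 / (sigma_intercept^2 + sigma_error^2)
empirical_corr   <- cor(y[, 1], y[, 3])
```

Because every subject keeps the same `u_0` at all time points, measurements on the same subject move up or down together, which is what produces the positive correlation.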

Random intercepts model, 3

  • \(SD(Y_{ij})=\sqrt{\sigma^2_{intercept}\ +\ \sigma^2_{error}}\)
  • \(Corr(Y_{ij}, Y_{im})=\frac{\sigma^2_{intercept}}{\sigma^2_{intercept}\ +\ \sigma^2_{error}}\)
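These two results follow from the model equation under the usual assumptions, which the slides do not spell out: \(u_{0i}\) is independent of every \(\epsilon_{ij}\), and errors at different time points are independent of each other. A sketch of the derivation for \(j \ne m\):

```latex
\begin{aligned}
Var(Y_{ij}) &= Var(u_{0i}) + Var(\epsilon_{ij})
             = \sigma^2_{intercept} + \sigma^2_{error} \\
Cov(Y_{ij}, Y_{im}) &= Cov(u_{0i} + \epsilon_{ij},\ u_{0i} + \epsilon_{im})
             = Var(u_{0i}) = \sigma^2_{intercept} \\
Corr(Y_{ij}, Y_{im}) &= \frac{Cov(Y_{ij}, Y_{im})}{SD(Y_{ij})\ SD(Y_{im})}
             = \frac{\sigma^2_{intercept}}{\sigma^2_{intercept}\ +\ \sigma^2_{error}}
\end{aligned}
```

The fixed terms \(\beta_0\) and \(\beta_1 t_j\) are constants, so they drop out of every variance and covariance. The correlation does not depend on which two time points are compared.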

Random intercepts illustrated, 1

Random intercepts illustrated, 2

Illustration of random intercepts with real data

Break #1

  • What you have learned
    • Mathematical formulation of random intercepts model
  • What’s coming next
    • Description of HIV-intervention data

Description of hiv-intervention data, 1

data_dictionary: hiv-intervention.txt
source: OzDASL website
description: |
  This is a longitudinal study of an intervention in adolescents aged 14-18, intended to increase the frequency of condom-protected sex. Subjects were allocated randomly to treatment or control groups. All were evaluated prior to the intervention, immediately after the intervention, and 6 months and 12 months after the intervention. The outcome variable is the logarithm-transformed frequency of condom-protected sex, log(Y+1).

Description of hiv-intervention data, 2

BST:
  label: treatment group
  values:
    '1': BST intervention
    '0': control
Pre:
  label: Log-frequency of protected sex before the intervention
Post:
  label: Log-frequency of protected sex after the intervention
FU6:
  label: Log-frequency of protected sex reported at the 6-month follow-up
FU12:
  label: Log-frequency of protected sex reported at the 12-month follow-up

Glimpse of the hiv-intervention data

Rows: 20
Columns: 5
$ BST  <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
$ Pre  <dbl> 7, 25, 50, 16, 33, 10, 13, 22, 4, 17, 0, 69, 5, 4, 35, 7, 51, 25,…
$ Post <dbl> 22, 10, 36, 38, 25, 7, 33, 20, 0, 16, 0, 56, 0, 24, 8, 0, 53, 0, …
$ FU6  <dbl> 13, 17, 49, 34, 24, 23, 27, 21, 12, 20, 0, 14, 0, 0, 0, 9, 8, 0, …
$ FU12 <dbl> 14, 24, 23, 24, 25, 26, 24, 11, 0, 10, 0, 36, 5, 0, 0, 37, 26, 15…

Plot of the data, 1

Plot of the data, 2

Glimpse of the restructured and simplified data

Rows: 30
Columns: 4
$ BST           <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,…
$ id            <int> 1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 5, 5, 5, 6, 6, 6, 7,…
$ t             <dbl> 0, 6, 12, 0, 6, 12, 0, 6, 12, 0, 6, 12, 0, 6, 12, 0, 6, …
$ protected_sex <dbl> 22, 13, 14, 10, 17, 24, 36, 49, 23, 38, 34, 24, 25, 24, …
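The restructuring code itself is not shown on the slide. Judging from the two glimpses, the long format appears to keep only the BST intervention group, drop `Pre`, and map `Post`, `FU6`, and `FU12` to `t` = 0, 6, and 12. The base R sketch below reproduces that layout for the first three subjects, whose values are fully visible in the first glimpse. The names `hiv_wide`, `hiv_long`, and `time_map` are hypothetical, and the mapping is an inference, not the author's code.

```r
# First three rows of the wide data, copied from the first glimpse
hiv_wide <- data.frame(
  BST  = c(1, 1, 1),
  Pre  = c(7, 25, 50),
  Post = c(22, 10, 36),
  FU6  = c(13, 17, 49),
  FU12 = c(14, 24, 23)
)
hiv_wide$id <- seq_len(nrow(hiv_wide))

# Assumed mapping from column name to months after the intervention
time_map <- c(Post = 0, FU6 = 6, FU12 = 12)

# Stack one block of rows per time point (Pre is not used)
hiv_long <- do.call(rbind, lapply(names(time_map), function(v) {
  data.frame(
    BST = hiv_wide$BST,
    id = hiv_wide$id,
    t = time_map[[v]],
    protected_sex = hiv_wide[[v]]
  )
}))

# Sort so that each subject's measurements are adjacent and in time order
hiv_long <- hiv_long[order(hiv_long$id, hiv_long$t), ]
rownames(hiv_long) <- NULL
hiv_long
```

The resulting `protected_sex` column starts 22, 13, 14, 10, 17, 24, which matches the restructured glimpse.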

Break #2

  • What you have learned
    • Description of HIV-intervention data
  • What’s coming next
    • Random intercepts model using hiv-intervention data

Random intercepts analysis, 1

Linear mixed model fit by REML ['lmerMod']
Formula: protected_sex ~ t + (1 | id)
   Data: hiv_3

REML criterion at convergence: 215.1

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-1.8385 -0.6028  0.0364  0.5183  2.1998 

Random intercepts analysis, 2

Random effects:
 Groups   Name        Variance Std.Dev.
 id       (Intercept) 69.57    8.341   
 Residual             53.34    7.304   
Number of obs: 30, groups:  id, 10
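The two variance components can be plugged into the standard deviation and correlation formulas for the random intercepts model. A short sketch of that arithmetic, using only the numbers printed above (the variable names are illustrative):

```r
# Variance components copied from the lmer output
var_intercept <- 69.57   # between-subject variance, id (Intercept)
var_error     <- 53.34   # within-subject variance, Residual

# SD of a single observation: sqrt(sigma^2_intercept + sigma^2_error)
sd_total <- sqrt(var_intercept + var_error)

# Correlation between two measurements on the same subject
icc <- var_intercept / (var_intercept + var_error)

round(c(sd_total = sd_total, icc = icc), 3)
```

The correlation comes out at roughly 0.57, so a bit more than half of the total variation is attributable to differences between subjects.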

Random intercepts analysis, 3

Fixed effects:
            Estimate Std. Error t value
(Intercept)  22.2333     3.3768   6.584
t            -0.2167     0.2722  -0.796
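The fixed effects describe the estimated average trajectory. The sketch below computes the estimated mean at each measurement time and a rough 95% interval for the slope. The 1.96 multiplier is a normal approximation that is assumed here; `lmer` itself reports only the t value, without degrees of freedom or a p-value.

```r
# Estimates copied from the fixed effects table
beta_0_hat <- 22.2333
beta_1_hat <- -0.2167
se_beta_1  <- 0.2722

# Estimated average outcome at each measurement time
t_j <- c(0, 6, 12)
predicted_mean <- beta_0_hat + beta_1_hat * t_j

# Approximate 95% interval for the slope (normal approximation, an assumption)
ci_slope <- beta_1_hat + c(-1, 1) * qnorm(0.975) * se_beta_1

round(predicted_mean, 2)
round(ci_slope, 2)
```

The interval runs from a negative to a positive value, so under this approximation the data are consistent with no change over time.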

Random intercepts analysis, 4

Correlation of Fixed Effects:
  (Intr)
t -0.484

Live demo, fitting a random intercepts model

Break #3

  • What you have learned
    • Random intercepts model using hiv-intervention data
  • What’s coming next
    • Mathematical formulation of random slopes model

Random slopes model, 1

  • Same notation for the time and outcome variables
  • \(Y_{ij},\ i=1,...,n;\ j=1,...,k\)
    • n subjects, k time points
  • \(t_j\), time of jth measurement

Random slopes model, 2

  • \(Y_{ij}=\beta_0+u_{0i}+\beta_1 t_j+u_{1i} t_j+\epsilon_{ij}\)
    • \(\beta_0\) and \(\beta_1\) are unknown constants
    • \(u_{0i}\), \(u_{1i}\), and \(\epsilon_{ij}\) are normally distributed
      • \(SD(u_{0i})=\sigma_{intercept}\)
      • \(SD(u_{1i})=\sigma_{slope}\)
      • \(SD(\epsilon_{ij})=\sigma_{error}\)
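One consequence of adding a random slope is that the spread of the outcome is no longer constant over time. The simulation below is a sketch of this. All parameter values are invented for illustration, and it assumes that \(u_{0i}\), \(u_{1i}\), and \(\epsilon_{ij}\) are mutually independent. The slides do not say whether the random intercept and slope are allowed to be correlated.

```r
# Simulate Y_ij = beta_0 + u_0i + beta_1 * t_j + u_1i * t_j + epsilon_ij
# All numeric values below are hypothetical.
set.seed(5502)

n   <- 5000
t_j <- c(0, 6, 12)

beta_0 <- 20
beta_1 <- -0.2
sigma_intercept <- 8
sigma_slope     <- 1
sigma_error     <- 7

# Subject-specific deviations, drawn independently (an assumption)
u_0 <- rnorm(n, mean = 0, sd = sigma_intercept)
u_1 <- rnorm(n, mean = 0, sd = sigma_slope)

# One column per time point; each subject has its own intercept and slope
y <- sapply(t_j, function(time) {
  beta_0 + u_0 + (beta_1 + u_1) * time + rnorm(n, mean = 0, sd = sigma_error)
})

# Under independence, Var(Y_ij) = sigma^2_intercept + t_j^2 * sigma^2_slope + sigma^2_error
sd_by_time     <- apply(y, 2, sd)
theoretical_sd <- sqrt(sigma_intercept^2 + t_j^2 * sigma_slope^2 + sigma_error^2)

round(rbind(t_j, sd_by_time, theoretical_sd), 2)
```

With these values the standard deviation grows with time, because subjects with different slopes fan out from each other.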

Random slopes illustrated, 1

Random slopes illustrated, 2

Break #4

  • What you have learned
    • Mathematical formulation of random slopes model
  • What’s coming next
    • Assumptions and complications

Assumptions

  • Independence
    • Only between subjects
  • Normality
    • Residuals
    • Random intercepts and/or slopes
  • Linearity

Linearity check

Normality check within clusters

Normality check for random intercepts

Complications

  • Not a problem
    • Missing values
      • Better than Last Observation Carried Forward
  • Problems (more tedious than difficult)
    • Interactions
    • Nonlinear trends
    • Covariates
      • Between patients
      • Within patients

Live demo, checking assumptions

Break #5

  • What you have learned
    • Assumptions and complications
  • What’s coming next
    • Sample size justification

Effect of co-housing on sample size calculations, 1

Effect of co-housing on sample size calculations, 2

Effect of co-housing on sample size calculations, 3

Effect of co-housing on sample size calculations, 4

Sample size estimate without clustering

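library(broom)  # provides tidy()

delta <- 4               # difference in means to detect
sd_independence <- 11.1  # standard deviation assuming independent observations

power.t.test(
    n=NULL, 
    delta=delta,
    sd=sd_independence,
    sig.level=0.05,
    power=0.8,
    type="two.sample") |> 
    tidy() -> sample_size_1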

sample_size_1
# A tibble: 1 × 5
      n delta    sd sig.level power
  <dbl> <dbl> <dbl>     <dbl> <dbl>
1  122.     4  11.1      0.05   0.8

Sample size estimate with clustering, 4 animals per cage

deff <- 1+(4-1)*0.049
sd_correlated <- sd_independence * sqrt(deff)

power.t.test(
    n=NULL, 
    delta=4,
    sd=sd_correlated,
    sig.level=0.05,
    power=0.8,
    type="two.sample") |> 
    tidy()
# A tibble: 1 × 5
      n delta    sd sig.level power
  <dbl> <dbl> <dbl>     <dbl> <dbl>
1  140.     4  11.9      0.05   0.8

Sample size estimate with clustering, 8 animals per cage

deff <- 1+(8-1)*0.049
sd_correlated <- sd_independence * sqrt(deff)

power.t.test(
    n=NULL, 
    delta=4,
    sd=sd_correlated,
    sig.level=0.05,
    power=0.8,
    type="two.sample") |> 
    tidy()
# A tibble: 1 × 5
      n delta    sd sig.level power
  <dbl> <dbl> <dbl>     <dbl> <dbl>
1  163.     4  12.9      0.05   0.8

You could also just multiply the sample size by the design effect

sample_size_1$n * (1+(4-1)*0.049)
[1] 139.7624
sample_size_1$n * (1+(8-1)*0.049)
[1] 163.6451
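The design effect calculation can be wrapped in a small helper and applied to several cage sizes at once. This is a sketch in base R: `design_effect()` is a hypothetical helper name, it assumes equal numbers of animals in every cage, and it reuses the intraclass correlation of 0.049 from the slides above. Rounding up to whole cages is an added step that the slides do not show.

```r
# Design effect for equal-sized clusters: 1 + (m - 1) * icc
design_effect <- function(m, icc) {
  1 + (m - 1) * icc
}

icc <- 0.049

# Per-group sample size if all animals were independent
n_independent <- power.t.test(
  delta = 4,
  sd = 11.1,
  sig.level = 0.05,
  power = 0.8,
  type = "two.sample"
)$n

# Inflate the independent sample size for each cage size
animals_per_cage <- c(1, 2, 4, 8)
n_clustered <- n_independent * design_effect(animals_per_cage, icc)

# Round up to whole cages per group
cages_per_group <- ceiling(n_clustered / animals_per_cage)

data.frame(
  animals_per_cage,
  deff = design_effect(animals_per_cage, icc),
  n_per_group = ceiling(n_clustered),
  cages_per_group
)
```

Multiplying the sample size by the design effect and inflating the standard deviation by its square root give nearly the same answer because the required sample size is approximately proportional to the variance. The small difference comes from the degrees of freedom of the t distribution.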

Summary

  • What you have learned
    • Mathematical formulation of random intercepts model
    • Description of HIV-intervention data
    • Random intercepts model using hiv-intervention data
    • Mathematical formulation of random slopes model
    • Assumptions and complications
    • Sample size justification